- Title: RECIO DESIGN AND DEVELOPMENT NOTES
- Copyright: (C) 1994 William Pierpoint
- Version: 2.00
- Date: April 15, 1994
-
-
-
- 1.0 DATA STRUCTURES
-
- 1.1 REC structure for each record stream
-
- * defined in recio.h.
- * one static REC for recin (included in ROPEN_MAX count).
- * allocate dynamic array of RECs dimensioned to ROPEN_MAX-NREC in ropen().
- * each REC has two associated buffers:
-     1) record string buffer containing current record;
-        allocate when first record read;
-        reallocate if record becomes larger.
-     2) field string buffer containing current field;
-        allocate when first field read;
-        reallocate if field becomes larger.
- * deallocate dynamic RECs and associated buffers in rclose() and
- rcloseall() if all record streams closed; deallocate associated
- buffers for recin with an exit function registered with atexit().
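
The allocate-on-first-read, grow-when-larger policy described above can be sketched as follows. This is only an illustration: the strbuf type, its member names, and the two functions are assumptions made for the example and are not the definitions in recio.h.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one of the two buffers tied to a REC;
   the names are illustrative, not those used in recio.h. */
typedef struct {
    char  *str;   /* NULL until the first record or field is read */
    size_t size;  /* bytes currently allocated */
} strbuf;

/* Store src in the buffer, allocating on first use and growing only
   when src no longer fits. Returns 0 on success, or -1 if out of
   memory (the old contents are then left untouched). */
int strbuf_store(strbuf *b, const char *src)
{
    size_t need;

    assert(b != NULL && src != NULL);
    need = strlen(src) + 1;
    if (need > b->size) {
        char *p = realloc(b->str, need);  /* realloc(NULL, n) acts as malloc */
        if (p == NULL)
            return -1;
        b->str = p;
        b->size = need;
    }
    memcpy(b->str, src, need);
    return 0;
}

/* Release the buffer and reset it to the "never read" state. */
void strbuf_free(strbuf *b)
{
    assert(b != NULL);
    free(b->str);
    b->str = NULL;
    b->size = 0;
}
```

A buffer that starts as {NULL, 0} is allocated by the first strbuf_store() call; later calls reuse the same memory unless a longer string arrives.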
-
-
- 1.2 REC r_flags assignments
-
- Bit     Description
- -----   -------------------------------------------------------------
-  0      If clear, colno starts at 0; if set, colno starts at 1
-  1      If clear, read mode; if set, write/append mode
-  2-6    Reserved
-  7      If clear, EOF not reached; if set, EOF reached
-  8-11   If clear, no error; else, rerror number
- 12-15   If clear, no warning; else, rwarning number
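
One way to pack and unpack these bit fields is sketched below. The macro and function names are assumptions chosen for the example; recio.h may name and implement them differently.

```c
#include <assert.h>

/* Illustrative masks derived from the bit table; these names are
   assumptions, not the identifiers used in recio.h. */
#define F_COLNO1    0x0001u  /* bit 0: column numbers start at 1 */
#define F_WRITE     0x0002u  /* bit 1: write/append mode */
#define F_EOF       0x0080u  /* bit 7: end of file reached */
#define F_ERRSHIFT  8        /* bits 8-11 hold the error number */
#define F_WARNSHIFT 12       /* bits 12-15 hold the warning number */
#define F_NUMMASK   0x000Fu  /* each number is four bits wide */

/* Extract the error number (0 means no error). */
unsigned flags_error(unsigned flags)
{
    return (flags >> F_ERRSHIFT) & F_NUMMASK;
}

/* Extract the warning number (0 means no warning). */
unsigned flags_warning(unsigned flags)
{
    return (flags >> F_WARNSHIFT) & F_NUMMASK;
}

/* Return flags with the error field replaced by errnum. */
unsigned flags_set_error(unsigned flags, unsigned errnum)
{
    assert(errnum <= F_NUMMASK);
    flags &= ~(F_NUMMASK << F_ERRSHIFT);
    return flags | (errnum << F_ERRSHIFT);
}
```

Because each number occupies four bits, at most 15 distinct error numbers and 15 distinct warning numbers fit in this layout.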
-
-
- 1.3 Accessing REC Members and Associated Buffers
-
- How do I
- * access the name of the record stream?             rnames()
- * access the current context number?                rcxtno()
- * access the current record number?                 rrecno()
- * access the current field number?                  rfldno()
- * access the current column number?                 rcolno()
- * access the record string buffer?                  rrecs()
- * access the field string buffer?                   rflds()
- * determine if column numbers start at 0 or 1?      rbegcolno()
- * determine if there are more records left?         reof()
- * determine if there is an error on the stream?     rerror()
- * determine if there is a warning on the stream?    rwarning()
- * access the error message for the stream?          rerrstr()
- * force an error on a record stream?                rseterr()
- * clear an error on a record stream?                rclearerr()
- * increase the size of the record string buffer?    rsetrecsiz()
- * increase the size of the field string buffer?     rsetfldsiz()
- * replace the data in the field string buffer?      rsetfldstr()
- * set the field delimiter character?                rsetfldch()
- * set the text delimiter character?                 rsettxtch()
- * set the context number?                           rsetcxtno()
- * set column numbering to start at 0 or 1?          rsetbegcolno()
- * check for incoming empty data strings?            rsetstrattr()
-
-
-
- 2.0 CODE STRUCTURES
-
- 2.1 Functional Decomposition
-
-                              ╔═════════╗
-                              ║ recio.c ║
-                              ╚═╤══╤══╤═╝
-                                │  │  │
-           ╔════════╗   input   │  │  │  output  ╔════════╗
-           ║ rget.c ╟───────────┘  │  └──────────╢ rput.c ║
-   char    ╚═╤════╤═╝ column       │     char    ╚═╤════╤═╝ column
- delimited   │    │  delimited     │   delimited   │    │  delimited
- ╔═════════╗ │    │ ╔══════════╗   │   ╔═════════╗ │    │ ╔══════════╗
- ║ rgets.c ╟─┤    ├─╢ rcgets.c ║   │   ║ rputs.c ╟─┤    ├─╢ rcputs.c ║
- ╚═════════╝ │    │ ╚══════════╝   │   ╚═════════╝ │    │ ╚══════════╝
-             │    │                │               │    │
- ╔═════════╗ │    │ ╔══════════╗   │   ╔═════════╗ │    │ ╔══════════╗
- ║ rgetf.c ╟─┤    ├─╢ rcgetf.c ║   │   ║ rputf.c ╟─┤    ├─╢ rcputf.c ║
- ╚═════════╝ │    │ ╚══════════╝   │   ╚═════════╝ │    │ ╚══════════╝
-             │    │                │               │    │
- ╔═════════╗ │    │ ╔══════════╗   │   ╔═════════╗ │    │ ╔══════════╗
- ║ rbget.c ╟─┘    └─╢ rcbget.c ║   │   ║ rbput.c ╟─┘    └─╢ rcbput.c ║
- ╚═════════╝        ╚══════════╝   │   ╚═════════╝        ╚══════════╝
-                                   │
-                    ╔══════════╗   │   ╔═════════╗        ╔══════════╗
-                    ║ rwarn.c  ╟───┴───╢ rerr.c  ╟────────╢ rfix.c   ║
-                    ╚══════════╝       ╚═════════╝        ╚══════════╝
-
-
- 2.2 Callback Error Function Skeleton
-
- if valid record pointer [risvalid(rp)]
-     if past end of file [reof(rp)]       (if reof test removed, past EOF will
-     else [error number set]               become R_EMISDAT or R_WEMPSTR)
-         switch error number [rerror(rp)]
-         case read data errors [R_ERANGE || R_EINVDAT || R_EMISDAT]
-         case write data errors [R_ENOPUT || R_EWIDTH]
-             switch context number [rcxtno(rp)]
-             case RECIN
-                 switch field number [rfldno(rp)]
-                 case 1 (first field read)
-                 case 2 (second field read)
-                 ...
-                 endcase
-             ...
-             default [missing or unknown context number]
-             endcase
-         case out of memory [R_ENOMEM]
-         case fatal errors [R_EINVAL || R_EINVMOD]
-         default [possibly set by application with rseterr()]
-         endcase
-     endif
- else [invalid record pointer]
-     switch error number [errno]
-     case out of memory [ENOMEM]
-     case out of record or file pointers [EMFILE]
-     case permission denied [EACCES]
-     case fatal errors [EINVAL]
-     default [possibly set by application with rseterr()]
-     endcase
- endif
-
-
- 2.3 Callback Warning Function Skeleton
-
- if valid record pointer [risvalid(rp)]
-     switch warning number [rwarning(rp)]
-     case data string empty [R_WEMPSTR]
-     case atexit fn full [R_WNOREG]
-     case data too wide for columns [R_WWIDTH]
-     default [possibly set by application with rsetwarn()]
-     endcase
- endif
-
- 2.4 Classes of Field Functions
-
- There are eight classes of field functions:
-
- rget - input character delimited field, base 10 if numeric field
- rbget - input numeric character delimited field, base 2-36
- rcget - input column delimited field, base 10 if numeric field
- rcbget - input numeric column delimited field, base 2-36
- rput - output character delimited field, base 10 if numeric field
- rbput - output numeric character delimited field, base 2-36
- rcput - output column delimited field, base 10 if numeric field
- rcbput - output numeric column delimited field, base 2-36
-
-
- 2.5 How to Define and Declare New Numeric Field Functions
-
- You can define a new function to input or output numeric data using one
- of these macros:
-
- macro:         macro defined in:    define new function in:
- -----------    -----------------    -----------------------
- rget_fn()      _rgetf.h             rgetf.c
- rbget_fn()     _rbget.h             rbget.c
- rcget_fn()     _rcgetf.h            rcgetf.c
- rcbget_fn()    _rcbget.h            rcbget.c
- rput_fn()      _rputf.h             rputf.c
- rbput_fn()     _rbput.h             rbput.c
- rcput_fn()     _rcputf.h            rcputf.c
- rcbput_fn()    _rcbput.h            rcbput.c
-
- macro:         declare new function in recio.h as:
- -----------    ---------------------------------------------------------
- rget_fn()      rget?(REC *rp);
- rbget_fn()     rbget?(REC *rp, int base);
- rcget_fn()     rcget?(REC *rp, size_t begcol, size_t endcol);
- rcbget_fn()    rcbget?(REC *rp, size_t begcol, size_t endcol, int base);
- rput_fn()      rput?(REC *rp, datatype);
- rbput_fn()     rbput?(REC *rp, int base, datatype);
- rcput_fn()     rcput?(REC *rp, size_t begcol, size_t endcol, datatype);
- rcbput_fn()    rcbput?(REC *rp, size_t begcol, size_t endcol, int base,
-                        datatype);
- where ? is one or more new unique letters
- and datatype is the data type (e.g. integer)
-
- rbget_fn and rcbget_fn define integral number input functions
- -----------------------------------------------------------------------
- fn_type defined function return type
- fn_name defined function name
- fn_err defined function error return value
- cv_type conversion function return type
- cv_name conversion function name
- fn_min inclusive valid minimum value (overflow limit)
- fn_max inclusive valid maximum value (overflow limit)
-
- rget_fn and rcget_fn define floating point input functions
- -----------------------------------------------------------------------
- fn_type defined function return type
- fn_name defined function name
- fn_err defined function error return value
- cv_type conversion function return type
- cv_name conversion function name
- fn_negmin inclusive valid negative minimum value (overflow limit)
- fn_negmax inclusive valid negative maximum value (underflow limit)
- fn_posmin inclusive valid positive minimum value (underflow limit)
- fn_posmax inclusive valid positive maximum value (overflow limit)
-
- rbput_fn and rcbput_fn define integral number output functions
- -----------------------------------------------------------------------
- fn_type defined function data type
- fn_name defined function name
- cv_type conversion function cast
- cv_name conversion function name
-
- rput_fn and rcput_fn define floating point output functions
- -----------------------------------------------------------------------
- fn_type defined function data type
- fn_name defined function name
- cv_type conversion function cast
- cv_name conversion function name
- cv_dig conversion function number of significant digits
-
- The commonly used conversion functions are:
-
- name:      return type:
- -------    --------------------------------------------------------
- strtol     long
- strtoul    unsigned long
- str2ul     unsigned long (tests for invalid negative numbers)
- strtod     double
- str2c      character
-
- name:      cast:
- ------     ----------------
- itoa       int
- ltoa       long
- ultoa      unsigned long
-
- Portability note: ANSI C reserves function names that begin with str
- followed by a lowercase letter for future library use.
-
- Example: suppose you want to define a function rgetb() that gets a
- boolean value (unsigned char) and generates an R_ERANGE error
- if the value is not 0 or 1:
-
-     /* definition to add to rgetf.c */
-     rget_fn(unsigned char, rgetb, 0, long, strtol, 0, 1)
-
-     /* declaration to add to recio.h */
-     unsigned char rgetb(REC *rp);
-
- --OR to generate an R_EINVDAT error if the value is not 0 or 1--
-
-     /* definition to add to rbget.c */
-     rbget_fn(unsigned char, rbgetb, 0, unsigned long, str2ul, 0, 1)
-
-     /* declaration to add to recio.h */
-     unsigned char rbgetb(REC *rp, int base);
-
-     /* macro to add to recio.h */
-     #define rgetb(rp) (rbgetb((rp), 2))
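
To show the general idea of a function-generating macro, here is a minimal sketch that takes the same seven parameters as the integral input macros. It is not the recio macro: str_get_fn, the GET_* status codes, and the generated function's signature (which converts a plain string instead of a REC field) are all assumptions made so the example stands alone.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

enum { GET_OK, GET_INVALID, GET_RANGE };  /* illustrative status codes */

/* Hypothetical macro in the spirit of rbget_fn(). It expands to a
   function that converts s with cv_name, rejects text that is not
   fully consumed, and range-checks the result against fn_min and
   fn_max. On failure the function returns fn_err. */
#define str_get_fn(fn_type, fn_name, fn_err, cv_type, cv_name, fn_min, fn_max) \
fn_type fn_name(const char *s, int base, int *status)                         \
{                                                                             \
    char *end;                                                                \
    cv_type val;                                                              \
    assert(s != NULL && status != NULL);                                      \
    errno = 0;                                                                \
    val = cv_name(s, &end, base);                                             \
    if (end == s || *end != '\0') {                                           \
        *status = GET_INVALID;                                                \
        return (fn_err);                                                      \
    }                                                                         \
    if (errno == ERANGE || val < (fn_min) || val > (fn_max)) {                \
        *status = GET_RANGE;                                                  \
        return (fn_err);                                                      \
    }                                                                         \
    *status = GET_OK;                                                         \
    return (fn_type)val;                                                      \
}

/* Expands to: unsigned char getb(const char *s, int base, int *status) */
str_get_fn(unsigned char, getb, 0, long, strtol, 0, 1)
```

With this sketch, getb("1", 10, &status) yields 1 with GET_OK, while getb("2", 10, &status) yields the error return value 0 with GET_RANGE.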
-
-
-
- 3.0 DEVELOPMENT NOTES
-
- 3.1 fgets (Microsoft C 5.1)
-
- Previous notes of mine indicate that Microsoft's fgets function does not
- work correctly when it reads a line of text that consists of only a newline.
- However, this can be worked around by first setting the string buffer to an
- empty string. If you plan on retaining the newline, you will need to test
- this further. The fgets function is used twice in the rgetrec function.
- If porting to Microsoft C, you may need to implement this fix in recio.c:
-
- *rrecs(rp) = '\0'; /* just prior to the first fgets (added v1.20) */
- *str = '\0'; /* just prior to the second fgets */
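
The workaround can be wrapped in a small helper, sketched below. read_line is an illustrative name and is not part of recio; unlike the note above about retaining the newline, this sketch strips it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read one line into buf, clearing buf first so that a failed or
   misbehaving fgets() cannot leave stale text behind. Returns buf,
   or NULL at end of file or on error. A trailing newline, if
   present, is removed. */
char *read_line(char *buf, int size, FILE *fp)
{
    size_t len;

    assert(buf != NULL && size > 0 && fp != NULL);
    *buf = '\0';                      /* the workaround from the notes */
    if (fgets(buf, size, fp) == NULL)
        return NULL;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[len - 1] = '\0';
    return buf;
}
```

A line holding only a newline then comes back as an empty string, and the buffer is also empty after a NULL return at end of file.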
-
-
- 3.2 fopen (Borland C 3.1)
-
- fopen() calls __openfp(), which calls open(). Borland's "Library Reference"
- documents error numbers for open(), but not for fopen(). These error
- numbers are ENOENT, EMFILE, EACCES, and EINVACC. Because ropen() screens
- the access code, the EINVACC error will not occur from the recio library.
-
-
- 3.3 strtol & strtoul (Borland C 3.1)
-
- These functions stop consuming input once they overflow, setting ERANGE.
- Hence endptr can point into the middle of a sequence of valid characters
- having the expected form as given in ANSI X3.159-1989, Sections 4.10.1.5
- and 4.10.1.6. IMHO this characteristic is not in conformance with the
- ANSI standard as endptr should only point to the first unrecognized
- character or to the terminating null character. Borland's strtod does
- not have this problem.
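
A quick way to check how a given library behaves is to measure how far endptr advances on an overflowing value. The helper below is a sketch for that purpose; chars_consumed is a name invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Convert s with strtol() in base 10 and report how many characters
   were consumed. *overflow is set to 1 if strtol() reported ERANGE,
   otherwise to 0. */
size_t chars_consumed(const char *s, int *overflow)
{
    char *end;

    assert(s != NULL && overflow != NULL);
    errno = 0;
    (void)strtol(s, &end, 10);
    *overflow = (errno == ERANGE);
    return (size_t)(end - s);
}
```

On a library that follows the reading of the standard given above, a string of digits too long for a long should be consumed in full with the overflow flag set; a count shorter than the digit string would indicate the behavior described for Borland C 3.1.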
-
- Note that ANSI X3.159-1989 Section 4.10.1.6 allows strtoul (unsigned
- long) to have an optional negative sign. A negative unsigned long?
- Borland 3.1 strtoul converts a negative long to an unsigned number
- without error. But I prefer to trap any negative numbers input to
- unsigned fields. So str2ul is a wrapper function for strtoul that
- first tests for a negative number and if one is found, flags the data
- as invalid and returns zero.
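
A wrapper along these lines might look like the sketch below. The name str2ul_sketch and the extra invalid parameter are assumptions made for the example; the real str2ul in recio may report invalid data differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Sketch of a strtoul() wrapper that rejects negative input. If the
   first non-space character is a minus sign, *invalid is set to 1,
   *endptr (when given) is set to s, and 0 is returned. Otherwise
   *invalid is set to 0 and the call is passed through to strtoul(). */
unsigned long str2ul_sketch(const char *s, char **endptr, int base,
                            int *invalid)
{
    const char *p = s;

    assert(s != NULL && invalid != NULL);
    while (isspace((unsigned char)*p))
        p++;
    if (*p == '-') {
        *invalid = 1;
        if (endptr != NULL)
            *endptr = (char *)s;
        return 0UL;
    }
    *invalid = 0;
    return strtoul(s, endptr, base);
}
```

The check runs before strtoul() is called, so a negative number is never silently converted to a large unsigned value.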
-
- The test suite includes -0 as a data value. The strtol function traps
- this as an ERANGE error and returns the overflow limit. The rfixi and
- rfixl functions substitute zero.
-